Tautomer Identification and Tautomer Structure Generation Based on the InChI Code

Authors

  • Torsten Thalheim
  • Armin Vollmer
  • Ralf-Uwe Ebert
  • Ralph Kühne
  • Gerrit Schüürmann
Abstract

An algorithm is introduced that enables fast generation of all possible prototropic tautomers resulting from the mobile H atoms and associated heteroatoms as defined in the InChI code. The InChI-derived set of possible tautomers comprises (1,3)-shifts for open-chain molecules and (1,n)-shifts (with n being an odd number >3) for ring systems. In addition, as an extension to the InChI scope, our algorithm also includes those larger (1,n)-shifts that can be constructed by joining separate but conjugated InChI sequences of tautomer-active heteroatoms. The algorithm is described in detail, with all major steps illustrated through explicit examples. Application to approximately 72,500 organic compounds taken from EINECS (European Inventory of Existing Commercial Chemical Substances) shows that around 11% of the substances occur in different heteroatom-prototropic tautomeric forms. Additional QSAR (quantitative structure-activity relationship) predictions of their soil sorption coefficient and water solubility reveal variations across tautomers of up to more than two and four orders of magnitude, respectively. For a small subset of nine compounds, analysis of quantum chemically predicted tautomer energies supports the view that, among all tautomers of a given compound, those restricted to H atom exchanges between heteroatoms usually include the thermodynamically most stable structures.
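As a rough illustration of the starting point described in the abstract, the sketch below reads the mobile-H groups from the `/h` layer of an InChI string and lists candidate distributions of those H atoms over the associated heteroatoms. It is not the authors' algorithm: the function names, the per-atom caps, and the assignment of atom 3 to N and atom 4 to O in the acetamide example are assumptions made for illustration, and no bond orders are reassigned or checked.

```python
import re
from collections import Counter
from itertools import combinations_with_replacement

# Mobile-H groups in the /h layer look like "(H,7,8)" or "(H2,3,4)".
MOBILE_H = re.compile(r"\(H(\d*)(-?)((?:,\d+)+)\)")


def mobile_h_groups(inchi):
    """Return [(h_count, [atom numbers])] for each mobile-H group in an InChI."""
    layers = inchi.split("/")
    # The first layer starting with a lowercase "h" is the hydrogen layer.
    h_layer = next((l[1:] for l in layers[1:] if l.startswith("h")), "")
    groups = []
    for m in MOBILE_H.finditer(h_layer):
        count = int(m.group(1)) if m.group(1) else 1
        atoms = [int(a) for a in m.group(3).split(",") if a]
        groups.append((count, atoms))
    return groups


def h_placements(count, atoms, caps):
    """List ways to put `count` H atoms on `atoms`, at most caps[a] per atom.

    `caps` is a hypothetical stand-in for a real valence check.
    """
    placements = []
    for combo in combinations_with_replacement(atoms, count):
        tally = Counter(combo)
        if all(n <= caps[a] for a, n in tally.items()):
            placements.append({a: tally.get(a, 0) for a in atoms})
    return placements


# Acetamide; atom 3 assumed to be N (up to 2 H), atom 4 assumed to be O (up to 1 H).
acetamide = "InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)"
groups = mobile_h_groups(acetamide)
candidates = h_placements(*groups[0], caps={3: 2, 4: 1})
for placement in candidates:
    print(placement)
```

Under these assumptions the two printed placements correspond to the amide form (both H on atom 3) and the imidic acid form (one H each on atoms 3 and 4). A real implementation would also have to shift the double bonds accordingly and handle the ring and joined-sequence cases the abstract mentions.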


Similar resources

Applications of the InChI in cheminformatics with the CDK and Bioclipse

BACKGROUND: The InChI algorithms are written in C++ and not available as a Java library. Integration into software written in Java therefore requires a bridge between C and Java libraries, provided by the Java Native Interface (JNI) technology. RESULTS: We here describe how the InChI library is used in the Bioclipse workbench and the Chemistry Development Kit (CDK) cheminformatics library. To mak...


A Theoretical Study on the Aromaticity of 5-methylcytosine tautomers in the gas phase

The aromaticity of 5-methylcytosine tautomers in the gas phase has been studied, and the chemical structures of related tautomers are investigated. The electronic energy, enthalpy and free energy of each tautomer are also estimated at the B3LYP/6-31G*//B3LYP/6-31G* and MP2/6-31G*//MP2/6-31G* levels.


Distinguishing tautomerism in the crystal structure of (Z)-N-(5-ethyl-2,3-dihydro-1,3,4-thiadiazol-2-ylidene)-4-methylbenzenesulfonamide using DFT-D calculations and 13C solid-state NMR

The crystal structure of the title compound, C11H13N3O2S2, has been determined previously on the basis of refinement against laboratory powder X-ray diffraction (PXRD) data, supported by comparison of measured and calculated (13)C solid-state NMR spectra [Hangan et al. (2010). Acta Cryst. B66, 615-621]. The molecule is tautomeric, and was reported as an amine tautomer [systematic name: N-(5-eth...


The Impact of Tautomer Forms on Pharmacophore-Based Virtual Screening

In the field of in silico screening, many applications do not automatically consider possible tautomeric states of molecules. However, the detection of new compound candidates might rely on correct structural description, which is important for the perfect fit toward the biologically relevant interactions. In this paper, we present a new exhaustive tautomer enumeration approach implemented by m...


Lagerkvist versus Crick

Present day data allow significant reconsideration of ideas on mechanisms underlying the degeneracy in the genetic code. Here a hypothesis is presented which links the degeneracy to possible conformational alterations in the codon-anticodon duplex. This enables explanation of Rumer symmetry in the table of the genetic code, coding of methionine and tryptophane without degeneracy and even predic...



Journal title:
  • Journal of Chemical Information and Modeling

Volume 50, Issue 7

Pages -

Publication date: 2010